Setting up a session
# Loaded packages with their versions
sessionInfo() # to check your locale
## R version 4.3.1 (2023-06-16)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Sonoma 14.5
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: Europe/Kiev
## tzcode source: internal
##
## attached base packages:
## [1] grid stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] gridExtra_2.3 sf_1.0-15 easyclimate_0.2.1 lubridate_1.9.3
## [5] forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2
## [9] readr_2.1.5 tidyr_1.3.1 tibble_3.2.1 ggplot2_3.5.0
## [13] tidyverse_2.0.0
##
## loaded via a namespace (and not attached):
## [1] sass_0.4.9 utf8_1.2.4 generics_0.1.3 class_7.3-22
## [5] KernSmooth_2.23-21 stringi_1.8.3 hms_1.1.3 digest_0.6.35
## [9] magrittr_2.0.3 evaluate_0.23 timechange_0.2.0 fastmap_1.1.1
## [13] jsonlite_1.8.8 e1071_1.7-14 DBI_1.2.2 fansi_1.0.6
## [17] scales_1.3.0 jquerylib_0.1.4 cli_3.6.2 rlang_1.1.3
## [21] units_0.8-5 munsell_0.5.1 withr_3.0.0 cachem_1.0.8
## [25] yaml_2.3.8 tools_4.3.1 tzdb_0.4.0 colorspace_2.1-0
## [29] vctrs_0.6.5 R6_2.5.1 proxy_0.4-27 lifecycle_1.0.4
## [33] classInt_0.4-10 pkgconfig_2.0.3 pillar_1.9.0 bslib_0.7.0
## [37] gtable_0.3.4 glue_1.7.0 Rcpp_1.0.12 xfun_0.43
## [41] tidyselect_1.2.1 rstudioapi_0.16.0 knitr_1.45 htmltools_0.5.8.1
## [45] rmarkdown_2.26 compiler_4.3.1
Since we have nights (“days”) with very few bats captured, we decided to drop the nights with lesser bats captured than the threshold.
Check how many bats we captured during the night, per territory.
For each bat captured, we measured two growth-related numerical variables - forearm length (Ra, mm), and weight (W, g).
What the statistical distribution of response variables?
Distribution in both response variables diviate from normality.
How Ra related to W?
Bats with bigger forearm (Ra) have higher mass (W), and vice versa. Atthe same time, there is a visible differences among years.
Is there any sex differences in growth metrics? Is that possible that male and female bats growth different?
##
## Shapiro-Wilk normality test
##
## data: df$Ra
## W = 0.99056, p-value = 0.0001243
##
## Bartlett test of homogeneity of variances
##
## data: Ra by sex
## Bartlett's K-squared = 0.21128, df = 1, p-value = 0.6458
##
## Shapiro-Wilk normality test
##
## data: df$W
## W = 0.98967, p-value = 5.125e-05
##
## Bartlett test of homogeneity of variances
##
## data: W by sex
## Bartlett's K-squared = 0.82962, df = 1, p-value = 0.3624
To select a method for assessing sex dependent variance, we first checked homogeneity of variances and normality. Bartlett test indicated good homogeneity for both Ra and W, but Shapiro-Wilk test indicated deviation from normality. On the other hand, visual exploration suggest highly symmetrical distribution. Thus, we tested differences between male and female bats using t-test:
Ra appeared to be sex-dependent, Weight did not show such pattern.
To reveal possible interactions with sex, we then took a look at the sex-dependent variation of Ra/W across the years of sampling.
In our data set, male/female ration in Ra seemed to be constant across the years of survey - male young bats had always smaller arm than female ones. Contrary, W did not show a constant pattern across the year.
Therefore, to test our hypothesis about spring weather effect on bat growth, we must fit models that include sex as either nominative explanatory variables with interactions or random effect.
Bat survey (inventory) lasted approximately two weeks in July, after all of the juveniles turn to be able to fly. Therefore, we can expect some directed changes in bat body measures over this period. We tested the relations between growth metrics (Ra and W) and day-of-the-year, controlling different years.
Ra did not seen to change over the survey period, whereas W constantly increased. If we compare study years, slopes of simple linear models slightly varies, as well as intercepts, but remains close to the horizontal (Ra seems to vary indifferently either year or capture date). By contrast, W shows constant slopes over the years, but differs in intercepts. In other words,
We also check if some sex-dependent differences in W exist in each year.
To numerically check different growth metrics’ variation during inventory time, we fit a simple linear models per years.
## [[1]]
## Coefficient p_value sign_level
## 2008 0.03251899 0.16228879
## 2009 0.03996433 0.09580011
## 2011 -0.01839296 0.42644586
## 2014 0.13586352 0.02530808 *
## 2019 0.07414264 0.01202252 *
##
## [[2]]
Ra is weakly related to the time of capture.
## [[1]]
## Coefficient p_value sign_level
## 2008 0.3838443 6.407741e-16 *
## 2009 0.3115096 6.578883e-12 *
## 2011 0.3305665 1.171298e-12 *
## 2014 0.3179832 5.773326e-03 *
## 2019 0.2251565 3.862218e-06 *
##
## [[2]]
W strongly related to the time of capture.
As we can see, Ra generally remains constant over survey period, and probably doesn’t differ among years. Weight, instead, (i) differs among years, and (ii) grows over the survey period.
In other words, bats captured later in a year might have higher weight, regardless of spring weather conditions. To minimize an effect of time within inventory period, we calculated a synthetic, capture-time-independent growth metrics called Body Mass Index (BMI), and test its variation over the inventory period and between the years. For each bat capture event, BMI was calculated as the residuals from a linear regression of the logarithmically transformed weight (W) against the logarithm of the day-of-the-year (day).
Log-transformation scales BMI from -1 to 1, which facilitates its interpretation. Inspired by Mikryukov et al. 2023, and, partly, Scaled mass index - SMI by Peig and Green 2009
Check BMI’s sensitivity to day-of-the-year:
As we can see on the Figure above, BMI isn’t sensitive to bat capture date within a survey period, but reflects differences among years.
## [[1]]
## Coefficient p_value sign_level
## 2008 0.0027457969 0.1305312
## 2009 -0.0007378799 0.6700145
## 2011 0.0009869243 0.6133391
## 2014 0.0012891446 0.8093400
## 2019 -0.0025366965 0.2552311
##
## [[2]]
## `geom_smooth()` using formula = 'y ~ x'
Similarly, BMI shows no statistically significant dependency on capture date (day-of-the-year), even when we treat it for each year separately.
Therefore, we choose to test two alternative response variables - forearm length (Ra) and Body Mass Index (BMI), as a possible growth metric in our research.
# Take a look at the timespan
summary(df$date)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## "2008-07-02" "2009-07-08" "2011-07-05" "2011-07-23" "2011-07-17" "2019-07-18"
Here we explore the relationship among bat forearm length, weight, and Body Mass Index and simplest possible aggregated climatic variables - average mean temperature, average minimal temperature, and cumulative precipitation. Time period for aggregation - one calendar month.
daily %>%
mutate(month = lubridate::month(date)) %>%
mutate(Year = as.factor(lubridate::year(date))) %>%
group_by(place_id, Year, month) %>%
summarise(sum(Prcp), mean(Tmean), mean(Tmin)) %>%
rename(Prcp = 4, Tmean = 5, Tmin = 6) -> aggregated_clim
## `summarise()` has grouped output by 'place_id', 'Year'. You can override using
## the `.groups` argument.
df %>%
select(Territory, place_id, Year, Ra, W, BMI) %>%
left_join(aggregated_clim) -> monthly_clim
## Joining with `by = join_by(place_id, Year)`
rm(aggregated_clim)
# Ra
monthly_clim %>%
filter(month %in% c(1:6)) %>%
ggplot(aes(x = Tmean, y = Ra)) +
geom_point(shape = 1) +
# geom_smooth(method = "lm") +
# geom_smooth() +
stat_smooth(method = glm) +
facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'
monthly_clim %>%
filter(month %in% c(1:6)) %>%
ggplot(aes(x = Tmin, y = Ra)) +
geom_point(shape = 1) +
# geom_smooth(method = "lm") +
# geom_smooth() +
stat_smooth(method = glm) +
facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'
monthly_clim %>%
filter(month %in% c(1:6)) %>%
ggplot(aes(x = Prcp, y = Ra)) +
geom_point(shape = 1) +
# geom_smooth(method = "lm") +
# geom_smooth() +
stat_smooth(method = glm) +
facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'
# W
monthly_clim %>%
filter(month %in% c(1:6)) %>%
ggplot(aes(x = Tmean, y = W)) +
geom_point(shape = 1) +
geom_smooth(method = "lm") +
facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'
monthly_clim %>%
filter(month %in% c(1:6)) %>%
ggplot(aes(x = Tmin, y = W)) +
geom_point(shape = 1) +
geom_smooth(method = "lm") +
facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'
monthly_clim %>%
filter(month %in% c(1:6)) %>%
ggplot(aes(x = Prcp, y = W)) +
geom_point(shape = 1) +
geom_smooth(method = "lm") +
facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'
# BMI
monthly_clim %>%
filter(month %in% c(1:6)) %>%
ggplot(aes(x = Tmean, y = BMI)) +
geom_point(shape = 1) +
geom_smooth(method = "lm") +
facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'
monthly_clim %>%
filter(month %in% c(1:6)) %>%
ggplot(aes(x = Tmin, y = BMI)) +
geom_point(shape = 1) +
geom_smooth(method = "lm") +
facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'
monthly_clim %>%
filter(month %in% c(1:6)) %>%
ggplot(aes(x = Prcp, y = BMI)) +
geom_point(shape = 1) +
geom_smooth(method = "lm") +
facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'
# Vignettes
# library(climwin)
#
# vignette("climwin", package = "climwin")
# vignette("advanced_climwin", package = "climwin")
# Load necessary libraries
library(climwin)
library(lme4)
# Set priors
cdate <- daily$date # date var in climate data
bdate <- df$date # date var in biological data
# Relative or absolute time response.
# If "absolute", reference day as day, month is needed
type <- "absolute"
refday <- c(01, 07)
# If "relative", time window calculated for each individual value
# type <- "relative"
# refday <- NULL
# Temporal resolution of climate data
# cinterval <- "day"
cinterval <- "week"
# Upper and lower limit for tested climate windows respectively
# Must correspond to the resolution chosen in ‘cinterval’
# if 'cinterval' set to "days"
# range <- c(250, 0)
# if 'cinterval' set to "week"
# range <- c(30, 0) # (~ 1 Dec)
range <- c(23, 0) # ~ 15 Jan
stat <- "mean"
func <- "lin"
# number of repeats for randomization
# repeats <- 5 # for metric = "C"
repeats <- 200 # for metric = "AIC"
# Metrics for assessing
# metric = "C"
metric = "AIC"
# Number of years in data
sample.size = nlevels(as.factor(as.vector(df$Year)))
Here we explore two temperature predictors: Mean (Tmean) and Minimal (Tmin) daily temperatures, averaged across time window.
# Set priors
# List of dependent variable candidates
xvar <- list(Tmean = daily$Tmean, Tmin = daily$Tmin)
# Null model with no climate signal
baseline <- lm(Ra ~ 1, data = df)
Candidate model fitting
# Fit candidate model set
# RaWin <- slidingwin(xvar = xvar,
# cdate = daily$date,
# bdate = df$date,
# baseline = baseline,
# cinterval = cinterval,
# range = range,
# type = type, refday = refday,
# stat = stat,
# func = func,
# spatial = list(df$place_id, daily$place_id))
#
# save(RaWin, file = paste0(modeldir, "RaWin.Rdata"))
# rm(RaWin)
# gc()
#
# Fit randomized model set for evaluation purposes
# RaRand <- randwin(repeats = repeats,
# xvar = xvar,
# cdate = cdate,
# bdate = bdate,
# baseline = baseline,
# cinterval = cinterval,
# range = range,
# type = type, refday = refday,
# stat = stat,
# func = func,
# spatial = list(df$place_id, daily$place_id))
#
# save(RaRand, file = paste0(modeldir, "RaRand.Rdata"))
# rm(RaRand)
# gc()
Model diagnostics
load(file = paste0(modeldir, "RaWin.Rdata"))
# Possible combinations of model parameters
RaWin$combos
## response climate type stat func DeltaAICc WindowOpen WindowClose
## 1 Ra Tmean absolute mean lin -8.75 7 5
## 2 Ra Tmin absolute mean lin -8.27 21 21
candidate_models <- list()
for (i in 1:(length(RaWin)-1)) {
candidate_models[[i]] <- head(RaWin[[i]]$Dataset)
}
candidate_models
## [[1]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 31 -8.754531 7 5 -0.11783871 0.03582217 NA
## 60 -8.176073 10 6 -0.18569572 0.05804150 NA
## 23 -8.133348 6 5 -0.07812802 0.02447159 NA
## 232 -8.068939 21 21 0.03628371 0.01140142 NA
## 73 -8.025189 11 5 -0.17088155 0.05381371 NA
## 61 -7.833059 10 5 -0.14672686 0.04665852 NA
## ModelBetaC ModelInt Function Furthest Closest Statistics Type K
## 31 NA 56.02267 lin 23 0 mean absolute 0
## 60 NA 56.60265 lin 23 0 mean absolute 0
## 23 NA 55.43276 lin 23 0 mean absolute 0
## 232 NA 54.26605 lin 23 0 mean absolute 0
## 73 NA 56.34719 lin 23 0 mean absolute 0
## 61 NA 56.16855 lin 23 0 mean absolute 0
## ModWeight sample.size Reference.day Reference.month Randomised
## 31 0.05504871 80 1 7 no
## 60 0.04122271 80 1 7 no
## 23 0.04035143 80 1 7 no
## 232 0.03907264 80 1 7 no
## 73 0.03822721 80 1 7 no
## 61 0.03472579 80 1 7 no
##
## [[2]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 232 -8.271720 21 21 0.03133884 0.009749385 NA
## 31 -8.040529 7 5 -0.14073998 0.044287529 NA
## 23 -7.885157 6 5 -0.09745162 0.030907018 NA
## 25 -7.876671 6 3 -0.12105236 0.038408621 NA
## 63 -7.657813 10 3 -0.15367592 0.049311878 NA
## 33 -7.630226 7 3 -0.14270778 0.045858260 NA
## ModelBetaC ModelInt Function Furthest Closest Statistics Type K
## 232 NA 54.31994 lin 23 0 mean absolute 0
## 31 NA 55.63115 lin 23 0 mean absolute 0
## 23 NA 55.23145 lin 23 0 mean absolute 0
## 25 NA 55.64902 lin 23 0 mean absolute 0
## 63 NA 55.61541 lin 23 0 mean absolute 0
## 33 NA 55.82954 lin 23 0 mean absolute 0
## ModWeight sample.size Reference.day Reference.month Randomised
## 232 0.05020483 80 1 7 no
## 31 0.04472424 80 1 7 no
## 23 0.04138132 80 1 7 no
## 25 0.04120611 80 1 7 no
## 63 0.03693491 80 1 7 no
## 33 0.03642896 80 1 7 no
load(file = paste0(modeldir, "RaRand.Rdata"))
randomized_models <- list()
for (i in 1:(length(RaRand)-1)) {
randomized_models[[i]] <- head(RaRand[[i]])
}
randomized_models
## [[1]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 277 -0.9112374 23 23 -0.9527015 0.5569915 NA
## 278 0.6014854 23 22 0.7529007 0.6334868 NA
## 279 0.2897328 23 21 0.8301107 0.6321991 NA
## 137 0.4139612 16 16 -0.6475232 0.5119223 NA
## 155 1.5869429 17 16 -0.1926266 0.2942540 NA
## 2771 1.8470358 23 23 -0.2294427 0.5580433 NA
## ModelBetaC ModelInt Function Furthest Closest Statistics Type K
## 277 NA 54.20434 lin 23 0 mean absolute 0
## 278 NA 55.02108 lin 23 0 mean absolute 0
## 279 NA 55.55371 lin 23 0 mean absolute 0
## 137 NA 57.81659 lin 23 0 mean absolute 0
## 155 NA 55.07879 lin 23 0 mean absolute 0
## 2771 NA 54.12201 lin 23 0 mean absolute 0
## ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 277 0.011283030 80 1 7 yes 1
## 278 0.005492493 80 1 7 yes 2
## 279 0.006858946 80 1 7 yes 3
## 137 0.006183120 80 1 7 yes 4
## 155 0.003753598 80 1 7 yes 5
## 2771 0.003604939 80 1 7 yes 6
## WeightDist
## 277 0.9366667
## 278 0.9400000
## 279 0.9400000
## 137 0.9433333
## 155 0.9433333
## 2771 0.9466667
##
## [[2]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 122 -2.3366078 15 14 0.7166091 0.3434229 NA
## 7 0.6468023 3 3 0.6859561 0.5866389 NA
## 30 -1.8099242 7 6 1.2258196 0.6266928 NA
## 173 0.3181124 18 17 0.7928865 0.6088798 NA
## 254 -1.7942770 22 22 -1.1386158 0.5833074 NA
## 211 -2.7111716 20 20 0.4717214 0.2168967 NA
## ModelBetaC ModelInt Function Furthest Closest Statistics Type K
## 122 NA 52.39326 lin 23 0 mean absolute 0
## 7 NA 44.09754 lin 23 0 mean absolute 0
## 30 NA 40.10981 lin 23 0 mean absolute 0
## 173 NA 52.96927 lin 23 0 mean absolute 0
## 254 NA 48.13238 lin 23 0 mean absolute 0
## 211 NA 58.57218 lin 23 0 mean absolute 0
## ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 122 0.014579566 80 1 7 yes 1
## 7 0.006029902 80 1 7 yes 2
## 30 0.006243374 80 1 7 yes 3
## 173 0.004734644 80 1 7 yes 4
## 254 0.013604576 80 1 7 yes 5
## 211 0.011178410 80 1 7 yes 6
## WeightDist
## 122 0.8966667
## 7 0.9433333
## 30 0.8700000
## 173 0.9233333
## 254 0.9166667
## 211 0.8466667
# Plot residuals against fitted values to check dependency function
# (linear, quadratic, cubic, etc.)
p_res_fit <- list()
for (i in 1:(length(RaWin)-1)) {
bestmod <- RaWin[[i]]$BestModel
# Create residuals vs fitted plot
p_res_fit[[i]] <- ggplot(bestmod, aes(x = .fitted, y = .resid)) +
geom_point(alpha = 0.5) +
geom_hline(yintercept = 0, linetype = 2) +
geom_smooth() +
labs(title = "Residuals vs. Fitted Plot for the best model candidate",
x = "Fitted Values",
y = "Residuals")
}
for (i in 1:length(p_res_fit)) {
plot(p_res_fit[[i]])
}
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
# Estimate how likely our observed result would be at random
# load(file = paste0(modeldir, "RaRand.Rdata"))
pvalues <- list()
for (i in 1:(length(RaRand)-1)) {
pvalues[[i]] <- pvalue(dataset = RaWin[[i]]$Dataset,
datasetrand = RaRand[[i]],
metric = metric,
sample.size = sample.size)
print(pvalues[[i]])
}
## [1] 0.01
## [1] 0.015
# Plot all diagnostic plots for a given parameter combination
plotalls <- list()
for (i in 1:(length(RaWin)-1)) {
RaOutput <- RaWin[[i]]$Dataset
RaRand_data <- RaRand[[i]]
WindowOpen <- RaWin[[i]]$Dataset[1, 2]
WindowClose <- RaWin[[i]]$Dataset[1, 3]
# Fit single best model
RaSingle <- singlewin(xvar = xvar[i],
cdate = cdate,
bdate = bdate,
baseline = baseline,
cinterval = cinterval,
range = c(WindowOpen, WindowClose),
type = type, refday = refday,
stat = stat,
func = func,
spatial = list(df$place_id, daily$place_id))
png(paste0(figuredir, "climwin_Ra_", RaWin$combos$climate[i], ".png"), width = 32 , height = 22,
units = "cm", res = 300)
plotall(dataset = RaOutput,
datasetrand = RaRand_data,
bestmodel = RaSingle$BestModel,
bestmodeldata = RaSingle$BestModelData)
dev.off()
}
## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47
## Warning: The dot-dot notation (`..density..`) was deprecated in ggplot2 3.4.0.
## ℹ Please use `after_stat(density)` instead.
## ℹ The deprecated feature was likely used in the climwin package.
## Please report the issue at <https://github.com/LiamDBailey/climwin/issues>.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47
# Null model with no climate signal
baseline <- lm(W ~ 1, data = df)
Candidate model fitting
# # Fit candidate model set
# WWin <- slidingwin(xvar = xvar,
# cdate = daily$date,
# bdate = df$date,
# baseline = baseline,
# cinterval = cinterval,
# range = range,
# type = type, refday = refday,
# stat = stat,
# func = func,
# spatial = list(df$place_id, daily$place_id))
#
# save(WWin, file = paste0(modeldir, "WWin.Rdata"))
# rm(WWin)
# gc()
#
# # Fit randomized model set for evaluation purposes
# WRand <- randwin(repeats = repeats,
# xvar = xvar,
# cdate = cdate,
# bdate = bdate,
# baseline = baseline,
# cinterval = cinterval,
# range = range,
# type = type, refday = refday,
# stat = stat,
# func = func,
# spatial = list(df$place_id, daily$place_id))
#
# save(WRand, file = paste0(modeldir, "WRand.Rdata"))
# rm(WRand)
# gc()
Model diagnostics
load(file = paste0(modeldir, "WWin.Rdata"))
# Possible combinations of model parameters
WWin$combos
## response climate type stat func DeltaAICc WindowOpen WindowClose
## 1 W Tmean absolute mean lin -135.21 12 1
## 2 W Tmin absolute mean lin -134.73 11 0
candidate_models <- list()
for (i in 1:(length(WWin)-1)) {
candidate_models[[i]] <- head(WWin[[i]]$Dataset)
}
candidate_models
## [[1]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ ModelBetaC
## 90 -135.2087 12 1 -2.353106 0.1917869 NA NA
## 89 -130.5201 12 2 -2.173549 0.1805567 NA NA
## 76 -127.3969 11 2 -1.589634 0.1337819 NA NA
## 77 -127.2708 11 1 -1.691312 0.1424147 NA NA
## 75 -114.3714 11 3 -1.334596 0.1189802 NA NA
## 70 -113.6503 11 8 -1.832388 0.1639086 NA NA
## ModelInt Function Furthest Closest Statistics Type K ModWeight
## 90 59.45743 lin 23 0 mean absolute 0 8.810454e-01
## 89 55.58989 lin 23 0 mean absolute 0 8.450666e-02
## 76 48.07177 lin 23 0 mean absolute 0 1.772941e-02
## 77 50.49553 lin 23 0 mean absolute 0 1.664629e-02
## 75 43.36865 lin 23 0 mean absolute 0 2.631709e-05
## 70 43.45472 lin 23 0 mean absolute 0 1.835081e-05
## sample.size Reference.day Reference.month Randomised
## 90 80 1 7 no
## 89 80 1 7 no
## 76 80 1 7 no
## 77 80 1 7 no
## 75 80 1 7 no
## 70 80 1 7 no
##
## [[2]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ ModelBetaC
## 78 -134.7341 11 0 -1.750979 0.1429826 NA NA
## 27 -133.4188 6 1 -1.358428 0.1115165 NA NA
## 35 -132.7023 7 1 -1.487545 0.1224713 NA NA
## 77 -130.7553 11 1 -1.377617 0.1143278 NA NA
## 34 -130.4450 7 2 -1.323155 0.1099486 NA NA
## 65 -129.7608 10 1 -1.561340 0.1301085 NA NA
## ModelInt Function Furthest Closest Statistics Type K ModWeight
## 78 42.39784 lin 23 0 mean absolute 0 0.43759003
## 27 42.13035 lin 23 0 mean absolute 0 0.22669710
## 35 42.99957 lin 23 0 mean absolute 0 0.15843804
## 77 37.74706 lin 23 0 mean absolute 0 0.05985121
## 34 40.41534 lin 23 0 mean absolute 0 0.05125090
## 65 40.74113 lin 23 0 mean absolute 0 0.03640276
## sample.size Reference.day Reference.month Randomised
## 78 80 1 7 no
## 27 80 1 7 no
## 35 80 1 7 no
## 77 80 1 7 no
## 34 80 1 7 no
## 65 80 1 7 no
load(file = paste0(modeldir, "WRand.Rdata"))
randomized_models <- list()
for (i in 1:(length(WRand)-1)) {
randomized_models[[i]] <- head(WRand[[i]])
}
randomized_models
## [[1]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 278 1.7392802 23 22 0.7465549 1.4196482 NA
## 1 -1.5723915 0 0 0.8030399 0.4239522 NA
## 237 -6.6168489 21 16 3.7672412 1.2801015 NA
## 137 -0.9839297 16 16 -1.9830729 1.1452334 NA
## 277 -1.8585084 23 23 -2.4535291 1.2464429 NA
## 279 0.4534431 23 21 -1.7686276 1.4158190 NA
## ModelBetaC ModelInt Function Furthest Closest Statistics Type K
## 278 NA 24.596319 lin 23 0 mean absolute 0
## 1 NA 7.754403 lin 23 0 mean absolute 0
## 237 NA 19.715414 lin 23 0 mean absolute 0
## 137 NA 35.073743 lin 23 0 mean absolute 0
## 277 NA 23.958224 lin 23 0 mean absolute 0
## 279 NA 20.572922 lin 23 0 mean absolute 0
## ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 278 0.003538857 80 1 7 yes 1
## 1 0.004939767 80 1 7 yes 2
## 237 0.026472945 80 1 7 yes 3
## 137 0.006690874 80 1 7 yes 4
## 277 0.019390154 80 1 7 yes 5
## 279 0.006391340 80 1 7 yes 6
## WeightDist
## 278 0.9466667
## 1 0.8900000
## 237 0.8600000
## 137 0.9266667
## 277 0.9400000
## 279 0.9433333
##
## [[2]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 11 -8.7102921 4 4 3.0761509 0.9370696 NA
## 3 1.5598889 1 0 -0.5854590 0.8674344 NA
## 4 -0.0995333 2 2 2.0205142 1.3898830 NA
## 94 -0.5767904 13 11 -2.4589742 1.5276913 NA
## 56 -0.2435277 10 10 0.7737837 0.5150140 NA
## 137 -2.6817542 16 16 -0.9930479 0.4580336 NA
## ModelBetaC ModelInt Function Furthest Closest Statistics Type K
## 11 NA -7.786423 lin 23 0 mean absolute 0
## 3 NA 32.196954 lin 23 0 mean absolute 0
## 4 NA -5.291509 lin 23 0 mean absolute 0
## 94 NA 41.543646 lin 23 0 mean absolute 0
## 56 NA 18.188937 lin 23 0 mean absolute 0
## 137 NA 26.439505 lin 23 0 mean absolute 0
## ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 11 0.026573464 80 1 7 yes 1
## 3 0.003944777 80 1 7 yes 2
## 4 0.006015264 80 1 7 yes 3
## 94 0.006114845 80 1 7 yes 4
## 56 0.007431424 80 1 7 yes 5
## 137 0.010698327 80 1 7 yes 6
## WeightDist
## 11 0.5500000
## 3 0.9466667
## 4 0.9200000
## 94 0.9166667
## 56 0.9300000
## 137 0.8433333
# Plot residuals against fitted values to check dependency function
# (linear, quadratic, cubic, etc.)
p_res_fit <- list()
for (i in 1:(length(WWin)-1)) {
bestmod <- WWin[[i]]$BestModel
# Create residuals vs fitted plot
p_res_fit[[i]] <- ggplot(bestmod, aes(x = .fitted, y = .resid)) +
geom_point(alpha = 0.5) +
geom_hline(yintercept = 0, linetype = 2) +
geom_smooth() +
labs(title = "Residuals vs. Fitted Plot for the best model candidate",
x = "Fitted Values",
y = "Residuals")
}
for (i in 1:length(p_res_fit)) {
plot(p_res_fit[[i]])
}
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
# Estimate how likely our observed result would be at random
# load(file = paste0(modeldir, "WRand.Rdata"))
pvalues <- list()
for (i in 1:(length(WRand)-1)) {
pvalues[[i]] <- pvalue(dataset = WWin[[i]]$Dataset,
datasetrand = WRand[[i]],
metric = metric,
sample.size = sample.size)
print(pvalues[[i]])
}
## [1] "<0.001"
## [1] "<0.001"
# Plot all diagnostic plots for a given parameter combination
plotalls <- list()
for (i in 1:(length(WWin)-1)) {
WOutput <- WWin[[i]]$Dataset
WRand_data <- WRand[[i]]
WindowOpen <- WWin[[i]]$Dataset[1, 2]
WindowClose <- WWin[[i]]$Dataset[1, 3]
# Fit single best model
WSingle <- singlewin(xvar = xvar[i],
cdate = cdate,
bdate = bdate,
baseline = baseline,
cinterval = cinterval,
range = c(WindowOpen, WindowClose),
type = type, refday = refday,
stat = stat,
func = func,
spatial = list(df$place_id, daily$place_id))
png(paste0(figuredir, "climwin_W_", WWin$combos$climate[i], ".png"), width = 32 , height = 22,
units = "cm", res = 300)
plotall(dataset = WOutput,
datasetrand = WRand_data,
bestmodel = WSingle$BestModel,
bestmodeldata = WSingle$BestModelData)
dev.off()
}
## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47
## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47
# Null model with no climate signal
baseline <- lm(BMI ~ 1, data = df)
Candidate model fitting
# Fit candidate model set
# BMIWin <- slidingwin(xvar = xvar,
# cdate = daily$date,
# bdate = df$date,
# baseline = baseline,
# cinterval = cinterval,
# range = range,
# type = type, refday = refday,
# stat = stat,
# func = func,
# spatial = list(df$place_id, daily$place_id))
#
# save(BMIWin, file = paste0(modeldir, "BMIWin.Rdata"))
# rm(BMIWin)
# gc()
#
# # Fit randomized model set for evaluation purposes
# BMIRand <- randwin(repeats = repeats,
# xvar = xvar,
# cdate = cdate,
# bdate = bdate,
# baseline = baseline,
# cinterval = cinterval,
# range = range,
# type = type, refday = refday,
# stat = stat,
# func = func,
# spatial = list(df$place_id, daily$place_id))
#
# save(BMIRand, file = paste0(modeldir, "BMIRand.Rdata"))
# rm(BMIRand)
# gc()
Model diagnostics
load(file = paste0(modeldir, "BMIWin.Rdata"))
# Possible combinations of model parameters
BMIWin$combos
## response climate type stat func DeltaAICc WindowOpen WindowClose
## 1 BMI Tmean absolute mean lin -174.43 11 1
## 2 BMI Tmin absolute mean lin -178.05 7 1
candidate_models <- list()
for (i in 1:(length(BMIWin)-1)) {
candidate_models[[i]] <- head(BMIWin[[i]]$Dataset)
}
candidate_models
## [[1]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 77 -174.4295 11 1 -0.07766813 0.005505513 NA
## 90 -172.3767 12 1 -0.10462041 0.007464997 NA
## 91 -160.4069 12 0 -0.12467036 0.009256844 NA
## 35 -158.1235 7 1 -0.06120473 0.004580478 NA
## 78 -156.3795 11 0 -0.08721827 0.006567189 NA
## 27 -155.1897 6 1 -0.05299700 0.004007224 NA
## ModelBetaC ModelInt Function Furthest Closest Statistics Type K
## 77 NA 1.231467 lin 23 0 mean absolute 0
## 90 NA 1.590732 lin 23 0 mean absolute 0
## 91 NA 1.951213 lin 23 0 mean absolute 0
## 35 NA 1.147511 lin 23 0 mean absolute 0
## 78 NA 1.420318 lin 23 0 mean absolute 0
## 27 NA 1.028236 lin 23 0 mean absolute 0
## ModWeight sample.size Reference.day Reference.month Randomised
## 77 7.354500e-01 80 1 7 no
## 90 2.635080e-01 80 1 7 no
## 91 6.631108e-04 80 1 7 no
## 35 2.117104e-04 80 1 7 no
## 78 8.851959e-05 80 1 7 no
## 27 4.882919e-05 80 1 7 no
##
## [[2]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 35 -178.0494 7 1 -0.06764405 0.004740403 NA
## 27 -177.0630 6 1 -0.06147451 0.004321416 NA
## 54 -165.0565 9 1 -0.06570544 0.004802372 NA
## 36 -164.4469 7 0 -0.08332423 0.006102579 NA
## 28 -164.2161 6 0 -0.07837909 0.005744858 NA
## 66 -161.4739 10 0 -0.08813650 0.006520326 NA
## ModelBetaC ModelInt Function Furthest Closest Statistics Type K
## 35 NA 0.8785793 lin 23 0 mean absolute 0
## 27 NA 0.8350032 lin 23 0 mean absolute 0
## 54 NA 0.7678112 lin 23 0 mean absolute 0
## 36 NA 1.1131405 lin 23 0 mean absolute 0
## 28 NA 1.0911800 lin 23 0 mean absolute 0
## 66 NA 1.0034297 lin 23 0 mean absolute 0
## ModWeight sample.size Reference.day Reference.month Randomised
## 35 0.6193364469 80 1 7 no
## 27 0.3782085551 80 1 7 no
## 54 0.0009344266 80 1 7 no
## 36 0.0006889429 80 1 7 no
## 28 0.0006138386 80 1 7 no
## 66 0.0001558139 80 1 7 no
load(file = paste0(modeldir, "BMIRand.Rdata"))
randomized_models <- list()
for (i in 1:(length(BMIRand)-1)) {
randomized_models[[i]] <- head(BMIRand[[i]])
}
randomized_models
## [[1]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 278 1.7856261 23 22 0.02720195 0.05668212 NA
## 191 1.4217199 19 19 0.01793895 0.02328707 NA
## 277 -1.1081828 23 23 0.08798754 0.04979054 NA
## 2771 -0.5028966 23 23 -0.07902360 0.04981116 NA
## 29 1.6411011 7 7 0.01903874 0.03111101 NA
## 211 1.6483584 20 20 0.02963739 0.04890527 NA
## ModelBetaC ModelInt Function Furthest Closest Statistics Type K
## 278 NA 0.03342645 lin 23 0 mean absolute 0
## 191 NA -0.01111284 lin 23 0 mean absolute 0
## 277 NA -0.01001583 lin 23 0 mean absolute 0
## 2771 NA 0.00899544 lin 23 0 mean absolute 0
## 29 NA -0.28434623 lin 23 0 mean absolute 0
## 211 NA 0.19839141 lin 23 0 mean absolute 0
## ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 278 0.003652545 80 1 7 yes 1
## 191 0.003972667 80 1 7 yes 2
## 277 0.012082502 80 1 7 yes 3
## 2771 0.010364197 80 1 7 yes 4
## 29 0.003683195 80 1 7 yes 5
## 211 0.003798873 80 1 7 yes 6
## WeightDist
## 278 0.9466667
## 191 0.9433333
## 277 0.9333333
## 2771 0.9400000
## 29 0.9433333
## 211 0.9466667
##
## [[2]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 37 1.6326127 8 8 -0.02896578 0.04680636 NA
## 47 -0.4875348 9 8 0.06995276 0.04422879 NA
## 193 1.5256214 19 17 -0.02127662 0.03040358 NA
## 277 0.3335591 23 23 0.02560200 0.01975061 NA
## 172 0.5516240 18 18 0.04029047 0.03331770 NA
## 13 -5.0087298 4 2 0.15074927 0.05681653 NA
## ModelBetaC ModelInt Function Furthest Closest Statistics Type K
## 37 NA 0.138121425 lin 23 0 mean absolute 0
## 47 NA -0.424075715 lin 23 0 mean absolute 0
## 193 NA 0.006623042 lin 23 0 mean absolute 0
## 277 NA 0.057488317 lin 23 0 mean absolute 0
## 172 NA -0.080982509 lin 23 0 mean absolute 0
## 13 NA -1.966914276 lin 23 0 mean absolute 0
## ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 37 0.003759648 80 1 7 yes 1
## 47 0.004799659 80 1 7 yes 2
## 193 0.003834957 80 1 7 yes 3
## 277 0.005626720 80 1 7 yes 4
## 172 0.004763455 80 1 7 yes 5
## 13 0.010336698 80 1 7 yes 6
## WeightDist
## 37 0.9433333
## 47 0.8933333
## 193 0.9433333
## 277 0.9300000
## 172 0.9266667
## 13 0.7966667
# Plot residuals against fitted values to check dependency function
# (linear, quadratic, cubic, etc.)
p_res_fit <- list()
for (i in 1:(length(BMIWin)-1)) {
bestmod <- BMIWin[[i]]$BestModel
# Create residuals vs fitted plot
p_res_fit[[i]] <- ggplot(bestmod, aes(x = .fitted, y = .resid)) +
geom_point(alpha = 0.5) +
geom_hline(yintercept = 0, linetype = 2) +
geom_smooth() +
labs(title = "Residuals vs. Fitted Plot for the best model candidate",
x = "Fitted Values",
y = "Residuals")
}
for (i in 1:length(p_res_fit)) {
plot(p_res_fit[[i]])
}
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
# Estimate how likely our observed result would be at random
# load(file = paste0(modeldir, "BMIRand.Rdata"))
pvalues <- list()
for (i in 1:(length(BMIRand)-1)) {
pvalues[[i]] <- pvalue(dataset = BMIWin[[i]]$Dataset,
datasetrand = BMIRand[[i]],
metric = metric,
sample.size = sample.size)
print(pvalues[[i]])
}
## [1] "<0.001"
## [1] "<0.001"
# Plot all diagnostic plots for a given parameter combination
plotalls <- list()
for (i in 1:(length(BMIWin)-1)) {
BMIOutput <- BMIWin[[i]]$Dataset
BMIRand_data <- BMIRand[[i]]
WindowOpen <- BMIWin[[i]]$Dataset[1, 2]
WindowClose <- BMIWin[[i]]$Dataset[1, 3]
# Fit single best model
BMISingle <- singlewin(xvar = xvar[i],
cdate = cdate,
bdate = bdate,
baseline = baseline,
cinterval = cinterval,
range = c(WindowOpen, WindowClose),
type = type, refday = refday,
stat = stat,
func = func,
spatial = list(df$place_id, daily$place_id))
png(paste0(figuredir, "climwin_BMI_", BMIWin$combos$climate[i], ".png"), width = 32 , height = 22,
units = "cm", res = 300)
plotall(dataset = BMIOutput,
datasetrand = BMIRand_data,
bestmodel = BMISingle$BestModel,
bestmodeldata = BMISingle$BestModelData)
dev.off()
}
## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47
## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47
Currently only average daily precipitation (Prcp) available.
# Set priors
# List of dependent variable candidates
xvar <- list(Prcp = daily$Prcp)
stat <- "sum"
# Null model with no climate signal
baseline <- lm(Ra ~ 1, data = df)
Candidate model fitting
# # Fit candidate model set
# RaWin <- slidingwin(xvar = xvar,
# cdate = daily$date,
# bdate = df$date,
# baseline = baseline,
# cinterval = cinterval,
# range = range,
# type = type, refday = refday,
# stat = stat,
# func = func,
# spatial = list(df$place_id, daily$place_id))
#
# save(RaWin, file = paste0(modeldir, "RaWin_Prcp.Rdata"))
# rm(RaWin)
# gc()
#
# # Fit randomized model set for evaluation purposes
# RaRand <- randwin(repeats = repeats,
# xvar = xvar,
# cdate = cdate,
# bdate = bdate,
# baseline = baseline,
# cinterval = cinterval,
# range = range,
# type = type, refday = refday,
# stat = stat,
# func = func,
# spatial = list(df$place_id, daily$place_id))
#
# save(RaRand, file = paste0(modeldir, "RaRand_Prcp.Rdata"))
# rm(RaRand)
# gc()
Model diagnostics
load(file = paste0(modeldir, "RaWin_Prcp.Rdata"))
# Possible combinations of model parameters
RaWin$combos
## response climate type stat func DeltaAICc WindowOpen WindowClose
## 1 Ra Prcp absolute sum lin -9.48 23 0
candidate_models <- list()
for (i in 1:(length(RaWin)-1)) {
candidate_models[[i]] <- head(RaWin[[i]]$Dataset)
}
candidate_models
## [[1]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 300 -9.479733 23 0 0.07261819 0.02136251 NA
## 11 -9.375061 4 4 -0.16210635 0.04790806 NA
## 14 -8.885283 4 1 -0.10577676 0.03196058 NA
## 2 -8.564872 1 1 -0.16930650 0.05193061 NA
## 20 -7.766115 5 1 -0.10801827 0.03446747 NA
## 276 -7.208573 22 0 0.06600546 0.02169290 NA
## ModelBetaC ModelInt Function Furthest Closest Statistics Type K
## 300 NA 51.89125 lin 23 0 sum absolute 0
## 11 NA 54.20336 lin 23 0 sum absolute 0
## 14 NA 54.43722 lin 23 0 sum absolute 0
## 2 NA 54.23251 lin 23 0 sum absolute 0
## 20 NA 54.60563 lin 23 0 sum absolute 0
## 276 NA 52.24956 lin 23 0 sum absolute 0
## ModWeight sample.size Reference.day Reference.month Randomised
## 300 0.11063531 80 1 7 no
## 11 0.10499403 80 1 7 no
## 14 0.08218841 80 1 7 no
## 2 0.07002197 80 1 7 no
## 20 0.04696630 80 1 7 no
## 276 0.03554002 80 1 7 no
load(file = paste0(modeldir, "RaRand_Prcp.Rdata"))
randomized_models <- list()
for (i in 1:(length(RaRand)-1)) {
randomized_models[[i]] <- head(RaRand[[i]])
}
randomized_models
## [[1]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 82 -0.5282470 12 9 -0.5664693 0.3552781 NA
## 105 -2.3651957 13 0 0.4811871 0.2298451 NA
## 24 -0.3677721 6 4 3.9568309 2.5639388 NA
## 58 -2.1946702 10 8 1.0582498 0.5156493 NA
## 232 0.0000000 21 21 NA NA NA
## 2321 0.0000000 21 21 NA NA NA
## ModelBetaC ModelInt Function Furthest Closest Statistics Type K
## 82 NA 59.72266 lin 23 0 sum absolute 0
## 105 NA 43.77419 lin 23 0 sum absolute 0
## 24 NA 46.95370 lin 23 0 sum absolute 0
## 58 NA 44.73846 lin 23 0 sum absolute 0
## 232 NA 54.09590 lin 23 0 sum absolute 0
## 2321 NA 54.09590 lin 23 0 sum absolute 0
## ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 82 0.008196999 80 1 7 yes 1
## 105 0.012706604 80 1 7 yes 2
## 24 0.010422264 80 1 7 yes 3
## 58 0.012919551 80 1 7 yes 4
## 232 0.008365568 80 1 7 yes 5
## 2321 0.006893276 80 1 7 yes 6
## WeightDist
## 82 0.9266667
## 105 0.8800000
## 24 0.9466667
## 58 0.8933333
## 232 0.9433333
## 2321 0.9333333
# Plot residuals against fitted values to check dependency function
# (linear, quadratic, cubic, etc.)
p_res_fit <- list()
for (i in 1:(length(RaWin)-1)) {
bestmod <- RaWin[[i]]$BestModel
# Create residuals vs fitted plot
p_res_fit[[i]] <- ggplot(bestmod, aes(x = .fitted, y = .resid)) +
geom_point(alpha = 0.5) +
geom_hline(yintercept = 0, linetype = 2) +
geom_smooth() +
labs(title = "Residuals vs. Fitted Plot for the best model candidate",
x = "Fitted Values",
y = "Residuals")
}
for (i in 1:length(p_res_fit)) {
plot(p_res_fit[[i]])
}
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
# Estimate how likely our observed result would be at random
# load(file = paste0(modeldir, "RaRand_Prcp.Rdata"))
pvalues <- list()
for (i in 1:(length(RaRand)-1)) {
pvalues[[i]] <- pvalue(dataset = RaWin[[i]]$Dataset,
datasetrand = RaRand[[i]],
metric = metric,
sample.size = sample.size)
print(pvalues[[i]])
}
## [1] 0.01
# Plot all diagnostic plots for a given parameter combination
plotalls <- list()
for (i in 1:(length(RaWin)-1)) {
RaOutput <- RaWin[[i]]$Dataset
RaRand_data <- RaRand[[i]]
WindowOpen <- RaWin[[i]]$Dataset[1, 2]
WindowClose <- RaWin[[i]]$Dataset[1, 3]
# Fit single best model
RaSingle <- singlewin(xvar = xvar[i],
cdate = cdate,
bdate = bdate,
baseline = baseline,
cinterval = cinterval,
range = c(WindowOpen, WindowClose),
type = type, refday = refday,
stat = stat,
func = func,
spatial = list(df$place_id, daily$place_id))
png(paste0(figuredir, "climwin_Ra_", RaWin$combos$climate[i], ".png"), width = 32 , height = 22,
units = "cm", res = 300)
plotall(dataset = RaOutput,
datasetrand = RaRand_data,
bestmodel = RaSingle$BestModel,
bestmodeldata = RaSingle$BestModelData)
dev.off()
}
## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47
# Null model with no climate signal
baseline <- lm(W ~ 1, data = df)
Candidate model fitting
# # Fit candidate model set
# WWin <- slidingwin(xvar = xvar,
# cdate = daily$date,
# bdate = df$date,
# baseline = baseline,
# cinterval = cinterval,
# range = range,
# type = type, refday = refday,
# stat = stat,
# func = func,
# spatial = list(df$place_id, daily$place_id))
#
# save(WWin, file = paste0(modeldir, "WWin_Prcp.Rdata"))
# rm(WWin)
# gc()
#
# # Fit randomized model set for evaluation purposes
# WRand <- randwin(repeats = repeats,
# xvar = xvar,
# cdate = cdate,
# bdate = bdate,
# baseline = baseline,
# cinterval = cinterval,
# range = range,
# type = type, refday = refday,
# stat = stat,
# func = func,
# spatial = list(df$place_id, daily$place_id))
#
# save(WRand, file = paste0(modeldir, "WRand_Prcp.Rdata"))
# rm(WRand)
# gc()
Model diagnostics
load(file = paste0(modeldir, "WWin_Prcp.Rdata"))
# Possible combinations of model parameters
WWin$combos
## response climate type stat func DeltaAICc WindowOpen WindowClose
## 1 W Prcp absolute sum lin -123.06 22 22
candidate_models <- list()
for (i in 1:(length(WWin)-1)) {
candidate_models[[i]] <- head(WWin[[i]]$Dataset)
}
candidate_models
## [[1]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 254 -123.06368 22 22 -1.3253373 0.11362773 NA
## 19 -109.58112 5 2 -1.0983371 0.10016521 NA
## 54 -104.92159 9 1 -0.9966768 0.09300497 NA
## 44 -103.27344 8 1 -0.7751790 0.07294194 NA
## 124 -92.58077 15 12 0.8068519 0.08039843 NA
## 142 -91.84189 16 11 0.7625155 0.07629873 NA
## ModelBetaC ModelInt Function Furthest Closest Statistics Type K
## 254 NA 24.67660 lin 23 0 sum absolute 0
## 19 NA 27.97575 lin 23 0 sum absolute 0
## 54 NA 34.19336 lin 23 0 sum absolute 0
## 44 NA 31.32547 lin 23 0 sum absolute 0
## 124 NA 20.74609 lin 23 0 sum absolute 0
## 142 NA 18.36155 lin 23 0 sum absolute 0
## ModWeight sample.size Reference.day Reference.month Randomised
## 254 9.986548e-01 80 1 7 no
## 19 1.179541e-03 80 1 7 no
## 54 1.147913e-04 80 1 7 no
## 44 5.035219e-05 80 1 7 no
## 124 2.399576e-07 80 1 7 no
## 142 1.658396e-07 80 1 7 no
load(file = paste0(modeldir, "WRand_Prcp.Rdata"))
randomized_models <- list()
for (i in 1:(length(WRand)-1)) {
randomized_models[[i]] <- head(WRand[[i]])
}
randomized_models
## [[1]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 232 0.0000000 21 21 NA NA NA
## 278 -2.6691459 23 22 2.826055 1.305248 NA
## 2321 0.0000000 21 21 NA NA NA
## 21 -0.6216693 5 0 -2.916436 1.796394 NA
## 2322 0.0000000 21 21 NA NA NA
## 2323 0.0000000 21 21 NA NA NA
## ModelBetaC ModelInt Function Furthest Closest Statistics Type K
## 232 NA 23.67893 lin 23 0 sum absolute 0
## 278 NA 13.46661 lin 23 0 sum absolute 0
## 2321 NA 23.67893 lin 23 0 sum absolute 0
## 21 NA 41.13023 lin 23 0 sum absolute 0
## 2322 NA 23.67893 lin 23 0 sum absolute 0
## 2323 NA 23.67893 lin 23 0 sum absolute 0
## ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 232 0.008746884 80 1 7 yes 1
## 278 0.008070136 80 1 7 yes 2
## 2321 0.008848540 80 1 7 yes 3
## 21 0.006794485 80 1 7 yes 4
## 2322 0.007440463 80 1 7 yes 5
## 2323 0.007443843 80 1 7 yes 6
## WeightDist
## 232 0.9466667
## 278 0.8266667
## 2321 0.9466667
## 21 0.9100000
## 2322 0.9366667
## 2323 0.9366667
# Plot residuals against fitted values to check dependency function
# (linear, quadratic, cubic, etc.)
p_res_fit <- list()
for (i in 1:(length(WWin)-1)) {
bestmod <- WWin[[i]]$BestModel
# Create residuals vs fitted plot
p_res_fit[[i]] <- ggplot(bestmod, aes(x = .fitted, y = .resid)) +
geom_point(alpha = 0.5) +
geom_hline(yintercept = 0, linetype = 2) +
geom_smooth() +
labs(title = "Residuals vs. Fitted Plot for the best model candidate",
x = "Fitted Values",
y = "Residuals")
}
for (i in 1:length(p_res_fit)) {
plot(p_res_fit[[i]])
}
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
# Estimate how likely our observed result would be at random
# load(file = paste0(modeldir, "WRand_Prcp.Rdata"))
pvalues <- list()
for (i in 1:(length(WRand)-1)) {
pvalues[[i]] <- pvalue(dataset = WWin[[i]]$Dataset,
datasetrand = WRand[[i]],
metric = metric,
sample.size = sample.size)
print(pvalues[[i]])
}
## [1] "<0.001"
# Plot all diagnostic plots for a given parameter combination
plotalls <- list()
for (i in 1:(length(WWin)-1)) {
WOutput <- WWin[[i]]$Dataset
WRand_data <- WRand[[i]]
WindowOpen <- WWin[[i]]$Dataset[1, 2]
WindowClose <- WWin[[i]]$Dataset[1, 3]
# Fit single best model
WSingle <- singlewin(xvar = xvar[i],
cdate = cdate,
bdate = bdate,
baseline = baseline,
cinterval = cinterval,
range = c(WindowOpen, WindowClose),
type = type, refday = refday,
stat = stat,
func = func,
spatial = list(df$place_id, daily$place_id))
png(paste0(figuredir, "climwin_W_", WWin$combos$climate[i], ".png"), width = 32 , height = 22,
units = "cm", res = 300)
plotall(dataset = WOutput,
datasetrand = WRand_data,
bestmodel = WSingle$BestModel,
bestmodeldata = WSingle$BestModelData)
dev.off()
}
## Warning in plotwin(dataset = dataset, cw = cwa): Top window has a weight
## greater than 0.95. Plotting single best window only.
## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47
# Null model with no climate signal
baseline <- lm(BMI ~ 1, data = df)
Candidate model fitting
# # Fit candidate model set
# BMIWin <- slidingwin(xvar = xvar,
# cdate = daily$date,
# bdate = df$date,
# baseline = baseline,
# cinterval = cinterval,
# range = range,
# type = type, refday = refday,
# stat = stat,
# func = func,
# spatial = list(df$place_id, daily$place_id))
#
# save(BMIWin, file = paste0(modeldir, "BMIWin_Prcp.Rdata"))
# rm(BMIWin)
# gc()
#
# # Fit randomized model set for evaluation purposes
# BMIRand <- randwin(repeats = repeats,
# xvar = xvar,
# cdate = cdate,
# bdate = bdate,
# baseline = baseline,
# cinterval = cinterval,
# range = range,
# type = type, refday = refday,
# stat = stat,
# func = func,
# spatial = list(df$place_id, daily$place_id))
#
# save(BMIRand, file = paste0(modeldir, "BMIRand_Prcp.Rdata"))
# rm(BMIRand)
# gc()
Model diagnostics
load(file = paste0(modeldir, "BMIWin_Prcp.Rdata"))
# Possible combinations of model parameters
BMIWin$combos
## response climate type stat func DeltaAICc WindowOpen WindowClose
## 1 BMI Prcp absolute sum lin -170.76 22 22
candidate_models <- list()
for (i in 1:(length(BMIWin)-1)) {
candidate_models[[i]] <- head(BMIWin[[i]]$Dataset)
}
candidate_models
## [[1]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 254 -170.7631 22 22 -0.06121892 0.004391032 NA
## 44 -145.9808 8 1 -0.03617652 0.002828411 NA
## 19 -138.6339 5 2 -0.04875724 0.003920471 NA
## 142 -126.1537 16 11 0.03517142 0.002975613 NA
## 41 -123.9901 8 4 -0.02465808 0.002105587 NA
## 43 -121.5037 8 2 -0.03452878 0.002980593 NA
## ModelBetaC ModelInt Function Furthest Closest Statistics Type K
## 254 NA 0.04608336 lin 23 0 sum absolute 0
## 44 NA 0.35685323 lin 23 0 sum absolute 0
## 19 NA 0.19074364 lin 23 0 sum absolute 0
## 142 NA -0.24526696 lin 23 0 sum absolute 0
## 41 NA 0.18001181 lin 23 0 sum absolute 0
## 43 NA 0.31273865 lin 23 0 sum absolute 0
## ModWeight sample.size Reference.day Reference.month Randomised
## 254 9.999957e-01 80 1 7 no
## 44 4.155110e-06 80 1 7 no
## 19 1.054958e-07 80 1 7 no
## 142 2.056803e-10 80 1 7 no
## 41 6.972311e-11 80 1 7 no
## 43 2.011211e-11 80 1 7 no
load(file = paste0(modeldir, "BMIRand_Prcp.Rdata"))
randomized_models <- list()
for (i in 1:(length(BMIRand)-1)) {
randomized_models[[i]] <- head(BMIRand[[i]])
}
randomized_models
## [[1]]
## deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 21 -1.329392 5 0 -0.1311001 0.07168742 NA
## 232 0.000000 21 21 NA NA NA
## 2321 0.000000 21 21 NA NA NA
## 2322 0.000000 21 21 NA NA NA
## 217 -6.687921 20 14 0.1322217 0.04474381 NA
## 2323 0.000000 21 21 NA NA NA
## ModelBetaC ModelInt Function Furthest Closest Statistics Type K
## 21 NA 7.844737e-01 lin 23 0 sum absolute 0
## 232 NA -9.645433e-18 lin 23 0 sum absolute 0
## 2321 NA -9.645433e-18 lin 23 0 sum absolute 0
## 2322 NA -9.645433e-18 lin 23 0 sum absolute 0
## 217 NA -8.521217e-01 lin 23 0 sum absolute 0
## 2323 NA -9.645433e-18 lin 23 0 sum absolute 0
## ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 21 0.009468525 80 1 7 yes 1
## 232 0.006276792 80 1 7 yes 2
## 2321 0.007850904 80 1 7 yes 3
## 2322 0.007588910 80 1 7 yes 4
## 217 0.051714927 80 1 7 yes 5
## 2323 0.008778716 80 1 7 yes 6
## WeightDist
## 21 0.9066667
## 232 0.9266667
## 2321 0.9400000
## 2322 0.9400000
## 217 0.7600000
## 2323 0.9466667
# Plot residuals against fitted values to check dependency function
# (linear, quadratic, cubic, etc.)
p_res_fit <- list()
for (i in 1:(length(BMIWin)-1)) {
bestmod <- BMIWin[[i]]$BestModel
# Create residuals vs fitted plot
p_res_fit[[i]] <- ggplot(bestmod, aes(x = .fitted, y = .resid)) +
geom_point(alpha = 0.5) +
geom_hline(yintercept = 0, linetype = 2) +
geom_smooth() +
labs(title = "Residuals vs. Fitted Plot for the best model candidate",
x = "Fitted Values",
y = "Residuals")
}
for (i in 1:length(p_res_fit)) {
plot(p_res_fit[[i]])
}
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
# Estimate how likely our observed result would be at random
# load(file = paste0(modeldir, "BMIRand_Prcp.Rdata"))
pvalues <- list()
for (i in 1:(length(BMIRand)-1)) {
pvalues[[i]] <- pvalue(dataset = BMIWin[[i]]$Dataset,
datasetrand = BMIRand[[i]],
metric = metric,
sample.size = sample.size)
print(pvalues[[i]])
}
## [1] "<0.001"
# Plot all diagnostic plots for a given parameter combination
plotalls <- list()
for (i in 1:(length(BMIWin)-1)) {
BMIOutput <- BMIWin[[i]]$Dataset
BMIRand_data <- BMIRand[[i]]
WindowOpen <- BMIWin[[i]]$Dataset[1, 2]
WindowClose <- BMIWin[[i]]$Dataset[1, 3]
# Fit single best model
BMISingle <- singlewin(xvar = xvar[i],
cdate = cdate,
bdate = bdate,
baseline = baseline,
cinterval = cinterval,
range = c(WindowOpen, WindowClose),
type = type, refday = refday,
stat = stat,
func = func,
spatial = list(df$place_id, daily$place_id))
png(paste0(figuredir, "climwin_BMI_", BMIWin$combos$climate[i], ".png"), width = 32 , height = 22,
units = "cm", res = 300)
plotall(dataset = BMIOutput,
datasetrand = BMIRand_data,
bestmodel = BMISingle$BestModel,
bestmodeldata = BMISingle$BestModelData)
dev.off()
}
## Warning in plotwin(dataset = dataset, cw = cwa): Top window has a weight
## greater than 0.95. Plotting single best window only.
## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47